196        Bioinformatics

perform differential analysis. Since our primary goal is to study the difference between the normal and tumor samples, we can construct a contrast using the “makeContrasts” function, fit the quasi-likelihood model with “glmQLFit”, and then conduct the statistical test using the “glmQLFTest” function as follows:

my.contrasts <- makeContrasts(conditiontumo-conditionnorm, levels=design)

fitq <- glmQLFit(yNorm, design)

qlfq <- glmQLFTest(fitq, contrast=my.contrasts)

topTags(qlfq, n=10, adjust.method="BH", sort.by="PValue", p.value=0.05)
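To make the contrast concrete, the following base-R sketch shows what “conditiontumo-conditionnorm” encodes. It assumes, hypothetically, that the design matrix was built with “model.matrix(~0+condition)” from a factor with levels “norm” and “tumo”; the sample layout and the expression values are made up for illustration and are not taken from the data set.

```r
# Hypothetical sample layout: three normal and three tumor samples.
condition <- factor(c("norm", "norm", "norm", "tumo", "tumo", "tumo"))

# A no-intercept design gives one column per group.
design <- model.matrix(~0 + condition)
print(colnames(design))

# The contrast conditiontumo - conditionnorm as a numeric vector
# over the design columns: -1 for normal, +1 for tumor.
contrast <- c(conditionnorm = -1, conditiontumo = 1)

# Made-up log2 expression levels for a single gene in each group.
group_log2 <- c(conditionnorm = 5.0, conditiontumo = 3.5)

# The tested quantity is tumor minus normal, so a negative value
# means lower expression in tumor samples.
logFC <- sum(contrast[colnames(design)] * group_log2[colnames(design)])
print(logFC)
```

With this ordering, a gene with higher expression in tumor samples gets a positive logFC; swapping the two terms in the contrast flips every sign.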

The “qlfq” object is a DGELRT object that stores the results of the GLM-based differential expression analysis. The “topTags” function prints the ten (n=10) most significant differentially expressed genes, as shown in Figure 5.22. The cutoff “p.value=0.05” is applied to the adjusted p-values (Benjamini-Hochberg, as set by “adjust.method”), so only genes with an adjusted p-value of at most 0.05 are listed, and fewer than ten rows may be returned. In the output, logFC is the log2 fold change of tumor relative to normal samples, so a negative logFC indicates a gene that is downregulated in tumor samples; logCPM is the average log2 counts per million; F is the quasi-likelihood F statistic for the null hypothesis that there is no difference in the gene's expression between the normal and tumor samples; PValue is the unadjusted p-value of that test; and FDR is the p-value adjusted for multiple testing to control the false discovery rate.
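Because “topTags” applies its “p.value” cutoff to the adjusted p-values rather than the raw ones, it can help to see how the Benjamini-Hochberg adjustment changes a small set of p-values. The following is a minimal base-R sketch using “p.adjust”; the five p-values and gene names are made up for illustration and do not come from the analysis above.

```r
# Made-up raw p-values for five genes (illustration only).
pvalue <- c(geneA = 0.001, geneB = 0.008, geneC = 0.039,
            geneD = 0.041, geneE = 0.20)

# Benjamini-Hochberg adjustment, the same method requested
# by adjust.method="BH" in topTags.
fdr <- p.adjust(pvalue, method = "BH")
print(fdr)

# Genes passing a 0.05 cutoff on raw versus adjusted p-values.
n_raw <- sum(pvalue <= 0.05)
n_fdr <- sum(fdr <= 0.05)
print(c(raw = n_raw, adjusted = n_fdr))
```

In this toy case four genes have raw p-values below 0.05, but only two remain once the cutoff is applied to the adjusted values, which is why the FDR column, not PValue, decides which genes are listed.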

To use the negative binomial GLM instead of the quasi-likelihood negative binomial model for the differential expression analysis, you can use the following script:

FIGURE 5.22  The top ten significantly expressed genes.